Introduction to BIT Chinese Spelling Correction System at CLP 2014 Bake-off
نویسندگان
چکیده
This paper describes the Chinese spelling correction system submitted by BIT at CLP Bake-off 2014 task 2. The system mainly includes two parts: 1) N-gram model is adopted to retrieve the non-words which are wrongly separated by word segmentation. The non-words are then corrected in terms of word frequency, pronunciation similarity, shape similarity and POS (part of speech) tag. 2) For wrong words, abnormal POS tag is used to indicate their location and dependency relation matching is employed to correct them. Experiment results demonstrate the effectiveness of our system.
منابع مشابه
Extended HMM and Ranking Models for Chinese Spelling Correction
Spelling correction has been studied for many decades, which can be classified into two categories: (1) regular text spelling correction, (2) query spelling correction. Although the two tasks share many common techniques, they have different concerns. This paper presents our work on the CLP-2014 bake-off. The task focuses on spelling checking on foreigner Chinese essays. Compared to online sear...
متن کاملNCTU and NTUT's Entry to CLP-2014 Chinese Spelling Check Evaluation
This paper describes our Chinese spelling check system submitted to SIGHAN Bake-off 2014 evaluation. The system’s main components are still the conditional random field (CRF)-based word segmentation/part-ofspeech (POS) tagger and tri-gram language model (LM) used last year. But we tried to refine the misspelling rules, decision-making threshold and improve LM rescoring speed to reduce false ala...
متن کاملNTOU Chinese Spelling Check System in CLP Bake-off 2014
This paper describes details of NTOU Chinese spelling check system participating in CLP2014 Bakeoff. Confusion sets were expanded by using two language resources, Shuowen and Four-Corner codes. A new method to find spelling errors in legal multi-character words was proposed. Comparison of sentence generation probabilities is the main information for error detection and correction. A rulebased c...
متن کاملChinese Spelling Check Evaluation at SIGHAN Bake-off 2013
This paper introduces an overview of Chinese Spelling Check task at SIGHAN Bake-off 2013. We describe all aspects of the task for Chinese spelling check, consisting of task description, data preparation, performance metrics, and evaluation results. This bake-off contains two subtasks, i.e., error detection and error correction. We evaluate the systems that can automatically point out the spelli...
متن کاملChinese Spelling Check System Based on Tri-gram Model
This paper describes our system in the Chinese spelling check (CSC) task of CLP-SIGHAN Bake-Off 2014. CSC is still an open problem today. To the best of our knowledge, n-gram language modeling (LM) is widely used in CSC because of its simplicity and fair predictive power. Our work in this paper continues this general line of research by using a tri-gram LM to detect and correct possible spellin...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014